Multiple imputation analysis of case-cohort studies.

Authors

  • Helena Marti
  • Michel Chavance
Abstract

The usual methods for analyzing case-cohort studies rely on weighted estimators that are sometimes not fully efficient. Multiple imputation might be a good alternative because it uses all the available data and approximates the maximum partial likelihood estimator. This method is based on the generation of several plausible complete data sets, taking into account uncertainty about the missing values. When the imputation model is correctly specified, the multiple imputation estimator is asymptotically unbiased and its variance is correctly estimated. We show that a correct imputation model must be estimated from the fully observed data (cases and controls), with case status included among the explanatory variables. To validate the approach, we analyzed case-cohort studies first with completely simulated data and then with case-cohort data sampled from two real cohorts. The analyses of simulated data showed that, when the imputation model was correct, the multiple imputation estimator was unbiased and efficient. The observed gain in precision ranged from 8 to 37 per cent for phase-1 variables and from 5 to 19 per cent for the phase-2 variable. When the imputation model was misspecified, the multiple imputation estimator was still more efficient than the weighted estimators but it was also slightly biased. The analyses of case-cohort data sampled from complete cohorts showed that, even when no strong predictor of the phase-2 variable was available, the multiple imputation estimator was unbiased, as precise as the weighted estimator for the phase-2 variable and slightly more precise than the weighted estimators for the phase-1 variables. However, the multiple imputation estimator was found to be biased when, because of interaction terms, some coefficients of the imputation model had to be estimated from small samples. Multiple imputation is an efficient technique for analyzing case-cohort data. In practice, we suggest building the analysis model using only the case-cohort data and weighted estimators. Multiple imputation can then be used to reanalyze the data with the selected model in order to improve the precision of the results.
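The procedure described in the abstract (impute the phase-2 variable several times from a model fitted on the fully observed subjects, with case status among the predictors, then combine the per-data-set results) can be illustrated with a minimal sketch. It is not the authors' implementation: the simulated cohort, the normal linear imputation model, and all function names (`impute_phase2`, `fit_logistic`, `pool_rubin`) are assumptions made for illustration, and a logistic regression stands in for the Cox model to keep the example dependency-free. The pooling step uses Rubin's rules.

```python
import numpy as np

def impute_phase2(x, z, d, observed, rng):
    """Draw one imputed copy of x from a normal linear model x ~ 1 + z + d."""
    A = np.column_stack([np.ones_like(z), z, d])  # case status d is a predictor
    Ao, xo = A[observed], x[observed]             # fit on fully observed subjects only
    XtX_inv = np.linalg.inv(Ao.T @ Ao)
    beta_hat = XtX_inv @ Ao.T @ xo
    resid = xo - Ao @ beta_hat
    df = len(xo) - A.shape[1]
    # Draw the parameters themselves so that the imputations carry
    # the uncertainty of the imputation model, not only residual noise.
    sigma2 = resid @ resid / rng.chisquare(df)
    beta = rng.multivariate_normal(beta_hat, sigma2 * XtX_inv)
    miss = ~observed
    x_imp = x.copy()
    x_imp[miss] = A[miss] @ beta + rng.normal(0.0, np.sqrt(sigma2), miss.sum())
    return x_imp

def fit_logistic(X, y, n_iter=25):
    """Newton-Raphson logistic regression (stand-in for the Cox model)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        H = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(H, X.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    cov = np.linalg.inv(X.T @ (X * (p * (1.0 - p))[:, None]))
    return beta, np.diag(cov)

def pool_rubin(estimates, variances):
    """Combine m completed-data analyses with Rubin's rules."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = estimates.shape[0]
    q_bar = estimates.mean(axis=0)              # pooled point estimate
    within = variances.mean(axis=0)             # average within-imputation variance
    between = estimates.var(axis=0, ddof=1)     # between-imputation variance
    return q_bar, within + (1.0 + 1.0 / m) * between

# Hypothetical cohort: z is a phase-1 variable known for everyone,
# x is a phase-2 variable measured only on cases and a random subcohort.
rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)
x_full = 0.7 * z + rng.normal(size=n)
risk = 1.0 / (1.0 + np.exp(-(-3.0 + 0.5 * z + 0.5 * x_full)))
d = (rng.random(n) < risk).astype(float)
observed = (rng.random(n) < 0.15) | (d == 1)
x = np.where(observed, x_full, np.nan)

fits = []
for _ in range(20):                             # 20 imputed data sets
    x_imp = impute_phase2(x, z, d, observed, rng)
    fits.append(fit_logistic(np.column_stack([np.ones(n), z, x_imp]), d))
est, var = pool_rubin([f[0] for f in fits], [f[1] for f in fits])
print("pooled coefficients (intercept, z, x):", est)
print("pooled standard errors:", np.sqrt(var))
```

In this sketch the normal linear imputation model is only approximately correct for the simulated data, which is the kind of situation in which the abstract reports a slight bias; with a time-to-event outcome the analysis step would be replaced by a Cox fit.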

Similar resources

Accuracy evaluation of different statistical and geostatistical censored data imputation approaches (Case study: Sari Gunay gold deposit)

Most geochemical datasets include missing data in varying proportions, which may cause a significant problem in geostatistical modeling or multivariate analysis of the data. Therefore, it is common to impute the missing data in most geochemical studies. In this study, three approaches called half detection (HD), multiple imputation (MI), and the cosimulation based on Markov model 2...

Several approaches to handling missing values of quantitative variables and their effect on the results of a clinical trial

Background and Objectives: A major challenge in longitudinal studies is missing data. Missing data may cause part of the information to be lost, which reduces the accuracy of the estimator and can make the results biased and inaccurate. Therefore, it is necessary to evaluate the missing data mechanism in a longitudinal study and to consider thi...

Application of multiple imputation in medical and epidemiological research

Missing data, which occur for different reasons, are an unavoidable problem in epidemiological studies. The problem is quite widespread and is therefore considered a challenge in research design and data analysis by many methodologists. Complete case analysis is often used in studies with missing data; however, this approach may result in inaccurate estimates and inferences due to bias associated...

Imputations of missing values in practice: results from imputations of serum cholesterol in 28 cohort studies.

Missing values, common in epidemiologic studies, are a major issue in obtaining valid estimates. Simulation studies have suggested that multiple imputation is an attractive method for imputing missing values, but it is relatively complex and requires specialized software. For each of 28 studies in the Asia Pacific Cohort Studies Collaboration, a comparison of eight imputation procedures (uncond...

Multiple imputation for estimating hazard ratios and predictive abilities in case-cohort surveys

BACKGROUND: The weighted estimators generally used for analyzing case-cohort studies are not fully efficient, and naive estimates of the predictive ability of a model from case-cohort data depend on the subcohort size. However, case-cohort studies represent a special type of incomplete data, and methods for analyzing incomplete data should be appropriate, in particular multiple imputation (MI). ...

Journal:
  • Statistics in Medicine

Volume 30, Issue 13

Pages: -

Publication date: 2011